Intersubject Correlation


Written by Juha Lahnakoski and Luke Chang

Synchrony between individuals happens at several levels, from behavior to brain activity (Nummenmaa et al., 2018, Nastase et al., 2019). To an observer, synchrony during interaction or joint motion (Hale et al., 2020) can reflect prosocial qualities such as rapport (Miles et al., 2009) or affiliation (Hove and Risen, 2009). During physically arousing ritualistic experiences, observers may selectively synchronize their heart rates while observing related, but not unrelated, individuals participating in the ritual (Konvalinka et al., 2011), and the degree of affective synchrony can predict the perceived social connection between two individuals (Cheong et al., 2020). Synchrony of brain activity is associated with, among other things, shared psychological perspectives toward a stimulus (Lahnakoski et al., 2014, Yeshurun et al., 2017, Yeshurun et al., 2017, Kang & Wheatley, 2017, Cheong et al., 2020) and friendship (Parkinson et al., 2018), and may also be disturbed in psychiatric conditions ranging from developmental conditions such as autism (Hasson et al., 2009, Salmi et al., 2013, Byrge et al., 2015) to more acute conditions such as first-episode psychosis (Mäntylä et al., 2018). Thus, measures of synchrony can offer a simple window onto many psychological processes.

In brain imaging, synchrony of brain activity is most commonly measured using intersubject correlations (ISC; Hasson et al., 2004). As the name implies, this method calculates linear correlations between participants and derives summary statistics from these correlations to measure the level of similarity of brain activity. Overall, the brain activity measured with fMRI during naturalistic stimulation conditions can be thought to consist of four main sources: (1) stimulus-driven brain activity shared by all or most of the participants, (2) individual or idiosyncratic activity elicited by the stimulus, (3) intrinsic activity that is not time-locked to the stimulus, and (4) noise from various sources. The idea behind ISC is to identify brain activity that is shared by a large proportion of the participants (category 1). Thus, this method evaluates how much of an individual’s brain activity is explained by this shared component. By contrast, if smaller groups of participants, e.g. pairs of friends within the study, share similar individual activity patterns (category 2), this may be better captured by the dyadic values in the pairwise matrices (Parkinson et al., 2018, Chen et al., 2020, Finn et al., 2020). Generally, the third category of activity is not readily detected by synchrony approaches (Chang et al., 2018). However, with inventive experimental designs, e.g. during verbal recall of previously experienced stimuli (Chen et al., 2017), it is still possible to extract shared brain activity patterns by temporally reorganizing the data, even when the original experiences of the participants were out of sync. The optimal choice of analysis will depend on the research question and the type of shared activity patterns that are of particular interest. Most commonly, ISCs are calculated locally within each voxel (or region), but this approach has also been extended to functional connectivity (Simony et al., 2016).
Intersubject correlations give a summary statistic of synchrony over long periods of time, usually over an entire imaging session. However, the level of synchrony may change considerably from one moment to the next with the experimental condition. We will also introduce time-varying measures of synchrony.

In this tutorial, we will cover:

  • how to compute pairwise ISC
  • how to perform hypothesis tests with ISC
  • how to perform dynamic ISC

Let's get started by watching some short videos illustrating how ISC can be used with naturalistic data.

Dr. Yaara Yeshurun, an Assistant Professor at Tel-Aviv University, will provide two example applications of how ISC can be used to characterize the brain mechanisms underlying shared understanding.

from IPython.display import YouTubeVideo

YouTubeVideo('pvYNjG1jvQE')
YouTubeVideo('wEKKTG7Q1DQ')

Computing Intersubject Correlation

Pairwise vs Average Response

Generally, ISCs are calculated using one of two main approaches. The first calculates pairwise correlations between all participant pairs to build a full intersubject correlation matrix. The second approach uses the average activity timecourse of the other participants as a model for each left-out participant. This procedure produces individual, rather than pairwise, spatial maps of similarity reflecting how similarly, or “typically”, each person’s brain activates. These maps lend themselves to the same kinds of analyses one might perform with first-level results of a traditional general linear model (GLM) analysis. However, some of the individual variability is lost with the latter approach, and thus the ISC values are typically of much higher magnitude compared to the pairwise matrices.
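To make the distinction concrete, here is a minimal NumPy sketch of both approaches (the helper names are ours, not from any toolbox):

```python
import numpy as np

def pairwise_isc(data):
    '''Pairwise approach: correlate every pair of subjects.
    data: (timepoints x subjects) array. Returns the lower triangle
    of the subject-by-subject correlation matrix.'''
    r = np.corrcoef(data.T)
    return r[np.tril_indices_from(r, k=-1)]

def leave_one_out_isc(data):
    '''Average-response approach: correlate each subject with the
    mean timecourse of all other subjects.'''
    n_sub = data.shape[1]
    return np.array([np.corrcoef(data[:, i],
                                 np.delete(data, i, axis=1).mean(axis=1))[0, 1]
                     for i in range(n_sub)])
```

With a strong shared signal, the leave-one-out values come out systematically larger than the pairwise correlations, which is exactly the inflation noted above.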

Summary Statistic

The next step after computing either the pairwise or average similarity across participants is to summarize the overall level of synchrony across participants. A straightforward approach is to compute the mean correlation. However, to make the correlation coefficients more normally distributed across the range of values, the Fisher z transformation (inverse hyperbolic tangent) is typically applied before computing the mean. The Fisher z transformation mainly affects the higher end of absolute correlation values, eventually stretching a correlation coefficient of 1 to infinity. However, with the typical range of pairwise ISC values, the effects of this transformation are relatively small, reaching ~10% at the higher end of the scale around r = 0.5. More recently, it has been suggested that computing the median, particularly when using the pairwise approach, provides a more accurate summary of the correlation values (Chen et al., 2016).
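For example, with a handful of hypothetical pairwise ISC values, the Fisher-z mean and the median can be computed directly (np.arctanh is the Fisher z transform):

```python
import numpy as np

# hypothetical pairwise ISC values (lower triangle of a similarity matrix)
pairwise_r = np.array([0.05, 0.08, 0.12, 0.18, 0.25, 0.31])

mean_r = np.mean(pairwise_r)                              # naive mean
fisher_mean_r = np.tanh(np.mean(np.arctanh(pairwise_r)))  # mean in z-space, back-transformed
median_r = np.median(pairwise_r)

# at r = 0.5 the transform inflates the value by roughly 10%
print(np.arctanh(0.5))  # ~0.549
```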

Hypothesis tests

Performing hypothesis tests that appropriately account for the false positive rate can be tricky with ISC because of the dependence between the pairwise correlation values and the inflated number of variables in the pairwise correlation matrices. Though there have been proposals to use mixed-effects models for a parametric solution (Chen et al., 2017), we generally recommend using non-parametric statistics when evaluating the significance of these correlations.

There are two general non-parametric approaches to performing hypothesis tests with ISC. The first is a permutation or randomization method, achieved by creating surrogate data and repeating the same analysis many times to build an empirical null distribution (e.g. 5-10k iterations). However, to meet the exchangeability assumption, it is important to consider the temporal dependence structure. Surrogate data can be created by circularly shifting the timecourses of the participants, or by scrambling the phases of the Fourier transform of the signals and transforming these scrambled signals back into the time domain (Theiler et al., 1992, Lancaster et al., 2018). Various blockwise scrambling techniques have also been applied, and autoregressive models have been proposed to create artificial data for statistical inference. These approaches have the benefit that, when properly designed, they retain important characteristics of the original signal, such as the frequency content and autocorrelation, while removing temporal synchrony in the data.

To illustrate these permutation-based approaches, the animation below depicts the process of creating these null distributions and compares them to a similar distribution built from real resting-state data of the same duration, recorded in the same participants just prior to the movie in the same imaging study. Resting state is an ideal condition for demonstrating the true null distribution of intersubject correlation, as it involves no external synchronizing factors apart from the repeating noise of the scanner gradients, which is generally not of interest to us. Thus, any correlations in the resting-state data arise by chance. As can be seen, the null distributions based on the surrogate data follow the distribution of resting-state ISCs well as the number of iterations increases.

[Animation: null distributions built from circularly shifted and phase-scrambled surrogate data, compared with the ISC distribution from real resting-state data]

The second approach employs a subject-wise bootstrap on the pairwise similarity matrices. Essentially, participants are randomly sampled with replacement and then a new similarity matrix is computed with these resampled participants. As a consequence of the resampling procedure, sometimes the same subjects are sampled multiple times, which introduces correlation values of 1 off the diagonal. Summarizing the ISC using the median can minimize the impact of these outliers. Python implementations in Brainiak and nltools convert these values to NaNs by default so that they are not included in the overall ISC summary statistic. If you would like to learn more about resampling methods, we encourage you to read our brief introduction available on the dartbrains course.

ISC Tutorial

Ok, now that we have covered the basics of ISC analysis, let's see how this works with real data. In this tutorial, we will be computing ISC on average activity within 50 ROIs using data from the Sherlock dataset using the nltools Python toolbox. We also recommend checking out the Brainiak tutorial, and the Matlab ISC-Toolbox (Kauppi et al., 2014).

We have already extracted the data for you to make this easier and have written out the average activity within each ROI into a separate csv file for each participant. If you would like to get practice doing this yourself, here is the code we used. Note that we are working with the hdf5 files as they load much faster than the nifti images, but either one will work for this example.

# note: assumes the imports below, plus `mask` and `base_dir`, are already defined
for scan in ['Part1', 'Part2']:
    file_list = glob.glob(os.path.join(base_dir, '*', 'func', f'*crop*{scan}*hdf5'))
    for f in file_list:
        sub = os.path.basename(f).split('_')[0]
        print(sub)
        data = Brain_Data(f)
        roi = data.extract_roi(mask)
        pd.DataFrame(roi.T).to_csv(os.path.join(os.path.dirname(f), f"{sub}_{scan}_Average_ROI_n50.csv"), index=False)

Let's load the modules we will be using for this tutorial. Make sure that you change the base_dir path to wherever you have downloaded the Sherlock dataset.

%matplotlib inline

import os
import glob
import numpy as np
from numpy.fft import fft, ifft, fftfreq
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from nltools.data import Brain_Data, Adjacency
from nltools.mask import expand_mask, roi_to_brain
from nltools.stats import isc, isfc, fdr, threshold, phase_randomize, circle_shift
from nilearn.plotting import view_img_on_surf, view_img
from sklearn.metrics import pairwise_distances
from sklearn.utils import check_random_state
from scipy.stats import ttest_1samp
import networkx as nx

base_dir = '/Volumes/Engram/Data/Sherlock/fmriprep' # change this to your path to the data

Ok, now let's download the k=50 whole-brain meta-analytic parcellation of the Neurosynth database (de la Vega, 2016) from Neurovault. Each ROI is indicated with a unique integer. We can expand this mask into 50 separate binary masks with the expand_mask function.

mask = Brain_Data('http://neurovault.org/media/images/2099/Neurosynth%20Parcellation_0.nii.gz')
mask_x = expand_mask(mask)

mask.plot()

Now, let's load the csv files for each participant. Remember, Sherlock viewing was separated into two 25-minute runs, so we will need to concatenate the two parts for each participant. There are many ways to do this step; in this example, we will store everything in a dictionary.

sub_list = [os.path.basename(x).split('_')[0] for x in glob.glob(os.path.join(base_dir, '*', 'func', '*Part1*csv'))]
sub_list.sort()

sub_timeseries = {}
for sub in sub_list:
    part1 = pd.read_csv(os.path.join(base_dir, sub, 'func', f'{sub}_Part1_Average_ROI_n50.csv'))
    part2 = pd.read_csv(os.path.join(base_dir, sub, 'func', f'{sub}_Part2_Average_ROI_n50.csv'))
    sub_data = pd.concat([part1, part2])
    sub_data.reset_index(inplace=True, drop=True)
    sub_timeseries[sub] = sub_data

Now, let's pick a single ROI to demonstrate how to perform ISC analyses. We can pick any region, but let's start with the vmPFC (roi=32); feel free to play with different regions as you work through the code.

We can plot the mask and create a new pandas DataFrame that has the average vmPFC activity for each participant.

roi = 32

mask_x[roi].plot()

sub_rois = {}
for sub in sub_timeseries:
    sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
sub_rois = pd.DataFrame(sub_rois)

sub_rois.head()
sub-01 sub-02 sub-03 sub-04 sub-05 sub-06 sub-07 sub-08 sub-09 sub-10 sub-11 sub-12 sub-13 sub-14 sub-15 sub-16
0 3.362605 -1.967253 -0.243505 2.527032 5.166227 -0.678549 2.199253 -1.646883e+00 0.421235 0.500547 0.361623 4.639737e+00 1.490442 1.806639 1.039467 3.483579e-13
1 0.995695 1.730923 1.552836 1.068784 4.066954 0.117737 3.184899 8.464993e-01 -0.118011 0.981400 -0.069505 2.522244e+00 1.145760 -0.582861 -0.420722 -1.237187e-13
2 2.084567 -1.940155 1.914897 1.103097 2.168681 0.030628 2.036096 1.782011e-01 0.984125 3.957482 -0.792416 1.326291e+00 0.472309 -3.066318 0.869296 -1.931528e-02
3 -0.217049 -0.636084 1.501459 -0.701397 1.704406 0.042397 2.353035 1.088203e+00 1.650786 3.687806 3.839885 2.105321e-02 -2.885314 -1.212683 1.213115 -1.460159e+00
4 -2.628723 1.650023 -1.196258 0.079026 1.297944 -0.743593 1.188282 3.375227e-13 1.515944 -0.709527 4.874887 2.279356e-13 -5.277045 0.232831 1.914874 1.745742e+00

Circle Shift Randomization

To perform ISC we will be using the nltools.stats.isc function. There are three different methods implemented to perform hypothesis tests (i.e., circular shifting data, phase randomization, and subject-wise bootstrap). We will walk through an example of how to run each one.

The idea behind circularly shifting the data is to generate random surrogate data that has the same autoregressive and temporal properties as the original data (Lancaster et al., 2018). This is fairly straightforward and involves randomly selecting a time point to become the new beginning of the timeseries and then concatenating the rest of the data at the end so that it has the same length as the original data. Of course, there will potentially be a sudden change in the data where the two parts were merged.
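The core of the operation can be sketched with np.roll (a toy version of what nltools.stats.circle_shift does; the helper name is ours):

```python
import numpy as np

def circular_shift(x, random_state=None):
    '''Shift the timeseries to a random starting point, wrapping the end around.'''
    rng = np.random.default_rng(random_state)
    offset = rng.integers(1, len(x))  # new starting timepoint
    return np.roll(x, offset)
```

Because the shift only reorders the timepoints, the amplitude spectrum (and hence the autocorrelation) is preserved exactly; only the alignment across subjects is destroyed.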

To demonstrate how this works, and to show that the circularly shifted data has the same spectral properties as the original data, we will plot one subject's time series and shift it using the nltools.stats.circle_shift function. Next to both timeseries we plot the amplitude spectrum from a fast Fourier transform.

sub = 'sub-02'
sampling_freq = .5

f,a = plt.subplots(nrows=2, ncols=2, figsize=(15, 5))
a[0,0].plot(sub_rois[sub], linewidth=2)
a[0,0].set_ylabel('Avg Activity', fontsize=16)
a[0,0].set_xlabel('Time (TR)', fontsize=16)
a[0,0].set_title('Observed Data', fontsize=16)

fft_data = fft(sub_rois[sub])
freq = fftfreq(len(fft_data), 1/sampling_freq)
n_freq = int(np.floor(len(fft_data)/2))
a[0,1].plot(freq[:n_freq], np.abs(fft_data)[:n_freq], linewidth=2)
a[0,1].set_xlabel('Frequency (Hz)', fontsize=18)
a[0,1].set_ylabel('Amplitude', fontsize=18)
a[0,1].set_title('Power Spectrum', fontsize=18)

circle_shift_data = circle_shift(sub_rois[sub])
a[1,0].plot(circle_shift_data, linewidth=2, color='red')
a[1,0].set_ylabel('Avg Activity', fontsize=16)
a[1,0].set_xlabel('Time (TR)', fontsize=16)
a[1,0].set_title('Circle Shifted Data', fontsize=16)

fft_circle = fft(circle_shift_data)
a[1,1].plot(freq[:n_freq], np.abs(fft_circle)[:n_freq], linewidth=2, color='red')
a[1,1].set_xlabel('Frequency (Hz)', fontsize=18)
a[1,1].set_ylabel('Amplitude', fontsize=18)
a[1,1].set_title('Circle Shifted Power Spectrum', fontsize=18)

plt.tight_layout()

Now that you understand how the circular shifting method works with a single time series, let's compute the ISC of the Sherlock viewing data within the vmPFC ROI with 5000 permutations. The function outputs a dictionary that contains the ISC value, the p-value, the 95% confidence intervals, and, optionally, the null distribution samples. All of the permutations and bootstraps are parallelized and will use as many cores as your computer has available.

stats_circle = isc(sub_rois, method='circle_shift', n_bootstraps=5000, return_bootstraps=True)

print(f"ISC: {stats_circle['isc']:.02}, p = {stats_circle['p']:.03}")
ISC: 0.074, p = 0.0002

Phase Randomization

Next we will show how phase randomization works. This method projects the data into frequency space using a fast Fourier transform, randomizes the phases, and then projects the data back into the time domain (Theiler et al., 1992, Lancaster et al., 2018). Like the circular shifting method, this generates a random surrogate of the data while maintaining a similar temporal and autoregressive structure as the original data. We will generate the same plots from the example above to illustrate how this works, using the nltools.stats.phase_randomize function.
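The principle can be sketched in plain NumPy: rotate the phases of the positive-frequency components while keeping their amplitudes (a simplified version of the idea, not the nltools implementation; the helper name is ours):

```python
import numpy as np

def phase_scramble(x, random_state=None):
    '''Return a surrogate with the same amplitude spectrum but random phases.'''
    rng = np.random.default_rng(random_state)
    n = len(x)
    fft_x = np.fft.rfft(x)
    phases = np.exp(1j * rng.uniform(0, 2 * np.pi, size=len(fft_x)))
    phases[0] = 1                  # keep the DC component real
    if n % 2 == 0:
        phases[-1] = 1             # keep the Nyquist component real
    return np.fft.irfft(fft_x * phases, n=n)
```

Multiplying each frequency bin by a unit-magnitude complex number leaves the power spectrum untouched, so the surrogate keeps the original frequency content and autocorrelation while its temporal alignment is randomized.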

sub = 'sub-02'
sampling_freq = .5

f,a = plt.subplots(nrows=2, ncols=2, figsize=(15, 5))
a[0,0].plot(sub_rois[sub], linewidth=2)
a[0,0].set_ylabel('Avg Activity', fontsize=16)
a[0,0].set_xlabel('Time (TR)', fontsize=16)
a[0,0].set_title('Observed Data', fontsize=16)

fft_data = fft(sub_rois[sub])
freq = fftfreq(len(fft_data), 1/sampling_freq)
n_freq = int(np.floor(len(fft_data)/2))
a[0,1].plot(freq[:n_freq], np.abs(fft_data)[:n_freq], linewidth=2)
a[0,1].set_xlabel('Frequency (Hz)', fontsize=18)
a[0,1].set_ylabel('Amplitude', fontsize=18)
a[0,1].set_title('Power Spectrum', fontsize=18)

phase_random_data = phase_randomize(sub_rois[sub])
a[1,0].plot(phase_random_data, linewidth=2, color='red')
a[1,0].set_ylabel('Avg Activity', fontsize=16)
a[1,0].set_xlabel('Time (TR)', fontsize=16)
a[1,0].set_title('Phase Randomized Data', fontsize=16)

fft_phase = fft(phase_random_data)
a[1,1].plot(freq[:n_freq], np.abs(fft_phase)[:n_freq], linewidth=2, color='red')
a[1,1].set_xlabel('Frequency (Hz)', fontsize=18)
a[1,1].set_ylabel('Amplitude', fontsize=18)
a[1,1].set_title('Phase Randomized Power Spectrum', fontsize=18)

plt.tight_layout()
stats_phase = isc(sub_rois, method='phase_randomize', n_bootstraps=5000, return_bootstraps=True)

print(f"ISC: {stats_phase['isc']:.02}, p = {stats_phase['p']:.03}")
ISC: 0.074, p = 0.0002

As you can see, the ISC value is identical across methods because the summary statistic, the median of the pairwise correlations, is computed on the same observed data; only the null distributions differ. The p-values are also similar and likely reflect the limited precision of the p-values that can be computed using only 5,000 permutations. For greater precision, you will need to increase the number of permutations, but remember that this will also increase the computational time.

Subject-wise Bootstrapping

The final approach we will illustrate is subject-wise bootstrapping of the pairwise similarity matrix . This is currently the recommended approach as it has been demonstrated to accurately control false positive rates and is almost an order of magnitude faster shuffling the similarity matrix rather than recomputing the pairwise similarity for the null distribution (Chen et al., 2016). Bootstrapping and permutation tests are different types of resampling statistics (see our resampling tutorial for a more in depth overview). Bootstrapping is typically used more for generating confidence intervals around an estimator, while permutation tests are used for performing hypothesis tests. However, p-values can also be computed using a bootstrap by subtracting the ISC from the null distribution and evaluating the percent of samples from the distribution that are smaller than the observed ISC (Hall et al., 1991).

Just like the examples above, we will illustrate what an example bootstrapped similarity matrix looks like. Notice how some subjects are repeatedly resampled, which means that there are multiple values of perfect correlations found off the diagonal. This should be accounted for by using the median summary statistic of the lower triangle. However, both Brainiak and nltools toolboxes convert these values to NaNs to minimize the impact of these outliers on the summary statistic.

def bootstrap_subject_matrix(similarity_matrix, random_state=None):
    '''This function shuffles subjects within a similarity matrix based on recommendation by Chen et al., 2016'''
    
    random_state = check_random_state(random_state)
    n_sub = similarity_matrix.shape[0]
    bootstrap_subject = sorted(random_state.choice(np.arange(n_sub), size=n_sub, replace=True))
    return similarity_matrix[bootstrap_subject, :][:, bootstrap_subject]


similarity = 1 - pairwise_distances(pd.DataFrame(sub_rois).T, metric='correlation')

f,a = plt.subplots(ncols=2, figsize=(12, 6), sharey=True)
sns.heatmap(similarity, square=True, cmap='RdBu_r', vmin=-1, vmax=1, xticklabels=False, yticklabels=False, ax=a[0])
a[0].set_ylabel('Subject', fontsize=18)
a[0].set_xlabel('Subject', fontsize=18)
a[0].set_title('Pairwise Similarity', fontsize=16)

sns.heatmap(bootstrap_subject_matrix(similarity), square=True, cmap='RdBu_r', vmin=-1, vmax=1, xticklabels=False, yticklabels=False, ax=a[1])
a[1].set_ylabel('Subject', fontsize=18)
a[1].set_xlabel('Subject', fontsize=18)
a[1].set_title('Bootstrapped Pairwise Similarity', fontsize=16)
stats_boot = isc(sub_rois, method='bootstrap', n_bootstraps=5000, return_bootstraps=True)

print(f"ISC: {stats_boot['isc']:.02}, p = {stats_boot['p']:.03}")
ISC: 0.074, p = 0.0002

The bootstrap procedure tends to run much faster than the permutation methods on our computer, which is one of the reasons that this is the recommended approach beyond effectively controlling the false positive rate (Chen et al., 2016).

Null Distributions

Since each of our examples saved the null distribution, we can plot a histogram of the null distribution from each method including the confidence intervals. Notice that the circle shift and phase randomization methods produce highly similar null distributions and confidence intervals, while the bootstrap method has a wider and less symmetric distribution. However, the observed ISC of 0.074 (red line) exceeds all of the samples from the null distribution yielding a very small p-value. Notice that our observed null distributions from the Sherlock data are very similar to the animation of null data presented above.

plt.figure(figsize=(12,6))
sns.distplot(stats_boot['null_distribution'] - stats_boot['isc'], kde=True, label='Bootstrap Null')
sns.distplot(stats_circle['null_distribution'], kde=True, label='Circle Shift Null')
sns.distplot(stats_phase['null_distribution'], kde=True, label='Phase Randomize Null')
plt.ylabel('Frequency', fontsize=18)
plt.xlabel('ISC Values (correlation)', fontsize=18)
plt.title('ISC Null Distribution', fontsize=20)
plt.axvline(stats_boot['isc'], linestyle='-', color='red', linewidth=4, label='Observed ISC')
plt.legend(fontsize=18)

plt.axvline(stats_boot['ci'][0] - stats_boot['isc'], linestyle='--', color='blue')
plt.axvline(stats_boot['ci'][1] - stats_boot['isc'], linestyle='--', color='blue')

plt.axvline(stats_circle['ci'][0], linestyle='--', color='orange')
plt.axvline(stats_circle['ci'][1], linestyle='--', color='orange')

plt.axvline(stats_phase['ci'][0], linestyle='--', color='green')
plt.axvline(stats_phase['ci'][1], linestyle='--', color='green')

Whole brain ISC

Now, let's calculate ISC looping over each of the 50 ROIs from the whole-brain meta-analytic parcellation (de la Vega, 2016). Here we loop over each ROI and grab the column from each subject's dataframe. We then run ISC on the combined subject's ROI timeseries using the median method and compute a hypothesis test using the subject-wise bootstrap method with 5000 samples (Chen et al., 2016). Finally, we convert each correlation and p-value from each region back into a Brain_Data instance.

isc_r, isc_p = {}, {}
for roi in range(50):
    sub_rois = {}
    for sub in sub_timeseries:
        sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
    stats = isc(pd.DataFrame(sub_rois), n_bootstraps=5000, metric='median', method='bootstrap')
    isc_r[roi], isc_p[roi] = stats['isc'], stats['p']
isc_r_brain, isc_p_brain = roi_to_brain(pd.Series(isc_r), mask_x), roi_to_brain(pd.Series(isc_p), mask_x)

Now we can plot the ISC values to visualize which regions had a higher overall ISC.

isc_r_brain.plot(cmap='RdBu_r')
view_img(isc_r_brain.to_nifti())

We can threshold using Bonferroni correction (p < 0.001 for the k=50 parcellation). Alternatively, we can threshold using the false discovery rate by setting thr=fdr(isc_p_brain.data). In this example, FDR happens to be more conservative than Bonferroni.

view_img_on_surf(threshold(isc_r_brain, isc_p_brain, thr=.001).to_nifti())

Intersubject functional connectivity

Functional connectivity reflects the similarity of activity timecourses between pairs of regions, and it has been particularly effective in characterizing the functional architecture of the brain during resting state, in the absence of task or external stimulation. Given enough data, these resting-state correlations are stable within an individual, and perturbations to the correlations between regions caused by external stimuli or tasks are relatively small (Gratton et al., 2018). Moreover, functional connectivity measured during naturalistic viewing may be even more effective at identifying individuals from their unique connectivity patterns than resting-state scans (Vanderwal et al., 2017).

To specifically address how brain regions coactivate due to naturalistic stimulation, ISC was recently extended to intersubject functional connectivity (ISFC) to measure brain connectivity between subjects (Simony et al., 2016). This method has the benefit of identifying connections that are activated consistently between participants by the stimulus while disregarding the intrinsic fluctuations as they are not time-locked between individuals. ISFC can illustrate how distant brain regions cooperate to make sense of the incoming stimulus streams. However, it can also highlight pairs of regions that show similar temporal activity patterns that are driven by the external stimulus rather than neural connections between the regions, which should be taken into account in the interpretation.

Compute average functional connectivity

In this tutorial, we will demonstrate how to perform ISFC using the averaging method. We iterate over each subject and compute the cross-correlation between each of the target subject's ROIs with the average ROI response of the other subjects. This yields a separate ROI by ROI ISFC matrix for each subject. We use the nltools.stats.isfc function, but we encourage the interested reader to check out the Brainiak implementation for a faster and more feature rich option.

We plot the average of these matrices as a heatmap. The diagonal reflects the ROI's ISC using the averaging method. You might recall that we used the pairwise method in the previous example to compute ISC for each ROI. Off diagonal values reflect the average intersubject functional connectivity (ISFC) between each ROI.

data = list(sub_timeseries.values())

isfc_output = isfc(data)

sns.heatmap(np.array(isfc_output).mean(axis=0), vmin=-1, vmax=1, square=True, cmap='RdBu_r', xticklabels=False, yticklabels=False)
plt.title('Average ISFC', fontsize=20)
plt.xlabel('ROI', fontsize=18)
plt.ylabel('ROI', fontsize=18)

We can threshold the ISFC connectivity matrix by running a one-sample ttest on each ISFC value and correcting for multiple comparisons using FDR.

We can also convert this into an adjacency matrix by binarizing the continuous t-values. In this example, we are specifically interested in exploring which regions have a positive ISFC. We use an arbitrary FDR threshold (q < 0.000001) in this example to create a sparse adjacency matrix.

t, p = ttest_1samp(np.array([x.reshape(-1) for x in isfc_output]), 0)
thresh = fdr(p, .0000001)
thresholded_t_pos = t.copy()
thresholded_t_pos[p > thresh] = 0
thresholded_t_pos[thresholded_t_pos <= 0] = 0
thresholded_t_pos[thresholded_t_pos > 0] = 1
thresholded_t_pos = np.reshape(thresholded_t_pos, isfc_output[0].shape)

sns.heatmap(thresholded_t_pos, square=True, xticklabels=False, yticklabels=False)
plt.title('Positive ISFC Edges', fontsize=20)
plt.xlabel('ROI', fontsize=18)
plt.ylabel('ROI', fontsize=18)

We can now convert this adjacency matrix into a graph and can visualize which regions are functionally connected to the most other regions.

def plot_network(data):
    '''Plot the degree of the thresholded ISFC Adjacency matrix'''
    
    if not isinstance(data, Adjacency):
        raise ValueError('data must be an Adjacency instance.')
        
    plt.figure(figsize=(20,15))
    G = data.to_graph()
    pos = nx.kamada_kawai_layout(G)
    node_and_degree = G.degree()
    nx.draw_networkx_edges(G, pos, width=3, alpha=.4)
    nx.draw_networkx_labels(G, pos, font_size=14, font_color='darkslategray')

    nx.draw_networkx_nodes(G, pos, nodelist=list(dict(node_and_degree).keys()),
                           node_size=[x[1]*100 for x in node_and_degree],
                           node_color=list(dict(node_and_degree).values()),
                           cmap=plt.cm.Reds_r, linewidths=2, edgecolors='darkslategray', alpha=1)
    
plot_network(Adjacency(thresholded_t_pos, matrix_type='similarity'))

Each ROI number on its own is not particularly informative. Most of the time, we find it helpful to project the number of connections (i.e., the degree) of each node back into brain space.

degree = pd.Series(dict(Adjacency(thresholded_t_pos, matrix_type='similarity').to_graph().degree()))
brain_degree = roi_to_brain(degree, mask_x)
brain_degree.plot()
view_img_on_surf(brain_degree.to_nifti())

Temporal dynamics of intersubject synchrony

The majority of research has focused on correlations calculated over entire datasets to reveal the average connectivity or intersubject similarity in a group of participants. However, functional brain networks are widely thought to be dynamic. Thus, it is not clear whether tools like correlation, which assumes a constant statistical dependence between the variables over the entire imaging session, are the most appropriate way to analyze data gathered during complex naturalistic stimulation.

A simple and popular way of looking at the temporal variability of synchrony while limiting the effects of signal amplitudes is to calculate correlations within sliding time windows. Because the amplitude within each window is standardized when the correlation is calculated, synchrony can be estimated even during windows when the signals are close to their mean values. However, the length of the temporal window forces a trade-off between temporal accuracy and the stability of the correlation coefficient calculated in that window. Very short time windows allow one to follow precisely when correlations occur, but they yield extremely unstable correlations, which can be dominated completely by, e.g., a single co-occurring spike or slope. Thus, sliding-window correlations in short time windows are often characterized by extreme correlation values that change signs wildly.
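The window-length trade-off is easy to demonstrate with a sliding-window correlation on synthetic data (a sketch with made-up signals, not fMRI data; in pandas, `rolling(...).corr(...)` computes the windowed correlation):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_tr = 200

# Two synthetic "subject" timeseries sharing a common signal plus noise
shared = rng.standard_normal(n_tr)
x = pd.Series(shared + rng.standard_normal(n_tr))
y = pd.Series(shared + rng.standard_normal(n_tr))

# Sliding-window correlation: each window is standardized internally,
# so short windows fluctuate far more than long ones
for window in [10, 40]:
    sliding_r = x.rolling(window).corr(y)
    print(f'window={window}: sliding r std={sliding_r.std():.2f}')
```

With the 10-TR window the correlation estimate swings wildly, while the 40-TR window is much more stable but blurs when synchrony occurs in time.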

Calculating the phase synchronization or phase locking of signals is one option to estimate time-varying synchronization of signals in a way that largely separates synchronization from signal amplitudes. It has been used widely for electrophysiological measures such as EEG and MEG, and more recently also for fMRI (Glerean et al., 2012). Phase synchronization leverages the Hilbert transform to transform the real-valued signals into a complex-valued analytic signal, which is a generalization of the phasor notation of sinusoidal signals used widely in engineering applications. For illustration, two examples of analytic signals with constant frequency and amplitude are shown in the bottom panel of Video 2, plotted in three dimensions (real, imaginary and time axes). In contrast to a time-invariant phasor, an analytic signal has a time-varying amplitude envelope and frequency and can thus be used to track changes in synchrony over time. However, for meaningful separation of the envelope and phase of the signal, the original signal has to be contained in a limited frequency band, which can be obtained through band-pass filtering. The smaller this frequency band is, the better the amplitude envelope is separated into a lower frequency than the phase of the signal in the pass-band (Glerean et al., 2012). However, poorly designed filters may affect the shape of the signal considerably and in extreme cases even remove the signal of interest. For example, some filters can cause non-linear phase shifts across the frequency spectrum, or an excessively tight pass-band may miss important frequencies completely.
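As a rough illustration of the analytic signal (a sketch using a synthetic amplitude-modulated sinusoid, not fMRI data), `scipy.signal.hilbert` returns the complex analytic signal, whose absolute value recovers the amplitude envelope and whose angle gives the instantaneous phase:

```python
import numpy as np
from scipy.signal import hilbert

# Amplitude-modulated sinusoid: slow, always-positive envelope (0.2 Hz)
# on a faster carrier (2 Hz), so the band-limitation condition holds
t = np.linspace(0, 10, 1000)
envelope = 1.5 + np.cos(2 * np.pi * 0.2 * t)
signal = envelope * np.sin(2 * np.pi * 2 * t)

# Analytic signal: real part is the original signal, magnitude is the
# amplitude envelope, angle is the instantaneous phase
analytic = hilbert(signal)
recovered_envelope = np.abs(analytic)
phase = np.angle(analytic)

# Away from the edges (Hilbert edge effects), the recovered envelope
# closely tracks the true one
mid = slice(100, 900)
```

If the envelope varied as fast as the carrier (i.e., the signal were broadband), this separation would break down, which is why narrow band-pass filtering precedes the transform in practice.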

Compared to sliding-window correlations, phase synchronization has the benefit that no explicit time windowing is required and synchronization is estimated at the original sampling frequency of the signals. However, in a single pairwise comparison, phase synchrony can get extreme values by chance, much like a stopped clock that shows the right time twice a day. This is illustrated in the bottom panel of Video 2. Despite the two signals being independent, the histogram on the left shows that many of the time points show extreme synchrony values. Accordingly, the estimate of mean synchrony oscillates with the phase of the signals, until eventually stabilizing around zero as expected for independent signals. Thus, phase synchrony of two signals has the potential of producing extreme synchrony values much like time windowed correlations. This can be mitigated by averaging. Averaging over the timepoints of a full session, intersubject phase synchronization (ISPS) of regional activity produces highly similar group-level results to ISC. However, this removes the benefit of the temporal accuracy of ISPS and is thus of limited use. By contrast, averaging over (pairs of) subjects improves the reliability of synchrony in a larger population while retaining the temporal accuracy.

Note that, here, we have used the cosine of the angular difference as a measure of pairwise synchrony. This produces time-averaged values that are consistent with the ISCs in the same regions. However, other measures can also be derived, such as the phase locking value, i.e., the absolute value of the mean of the phase vectors of the participants (Glerean et al., 2012). This is simpler than pairwise measures, and can be beneficial particularly where memory is a concern, but it is limited to the range [0, 1] rather than [-1, 1] as for correlation coefficients.
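The two measures can be sketched side by side on synthetic data (a toy example; `phases` is an assumed timepoints x subjects array of instantaneous phase angles, here derived from a shared oscillation plus noise):

```python
import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(1)
n_tp, n_sub = 500, 10

# Shared slow oscillation plus independent subject noise
t = np.arange(n_tp)
signals = np.sin(2 * np.pi * t / 50)[:, None] + 0.5 * rng.standard_normal((n_tp, n_sub))
phases = np.angle(hilbert(signals, axis=0))  # timepoints x subjects

# Pairwise synchrony: cosine of the phase difference, averaged over pairs;
# each timepoint lands in [-1, 1], comparable to a correlation
pair_sync = []
for i in range(n_sub):
    for j in range(i + 1, n_sub):
        pair_sync.append(np.cos(phases[:, i] - phases[:, j]))
pair_sync = np.mean(pair_sync, axis=0)

# Phase locking value: length of the mean phase vector across subjects;
# bounded to [0, 1], and cheaper as it needs no pairwise loop
plv = np.abs(np.mean(np.exp(1j * phases), axis=1))
```

Both timecourses peak together when the subjects' phases align, but only the pairwise cosine can go negative when phases systematically oppose each other.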

[Image: null_distribution]

Statistics for time-varying measures of synchrony largely follow a similar non-parametric approach as discussed above. However, depending on the values that are used, statistics of phase synchrony differ slightly from linear measures of synchrony. In particular, circular statistics can be used to estimate the parametric statistics of phase differences on the unit circle, as wrapping of the phase angle causes the angular difference to change signs as the signals rotate around the circle. Thus, angular differences of -pi and pi are equivalent on the unit circle, but are at the opposite ends of a linear axis. However, in the current examples, we have used cosines of the angular differences; the cosine is an even function and is thus indifferent to the sign of the angular difference. It also naturally maps the angular difference to a linear range between -1 and 1 similar to the correlation coefficient, thus simplifying the interpretation. If the sign of the phase shift is of interest, such as when trying to infer the temporal precedence of one signal over another, alternative measures taking into account the direction of the phase shift may be preferable.

What about the phase synchrony, would we also have a practical example for that and if so, can I do anything to help? As I briefly mention, I used cosines in the example video, which is different from our paper that I cite, where we had just normalized angular differences and average phase vectors. If we add a practical example, we could use either the cosines or the angles. Cosine is, of course, more consistent with how the slope in correlation is calculated, so the scaling is also more similar to correlation than with angles. This, I think, is helpful if people want to compare the results. However, I have no strong preference on which one to show as they are essentially the same thing.

To create the signals and the synchrony timecourse in the example video, I used the lines below in Matlab for ISPS that can be applied to the ROI data with a few edits (replacing the sin(t) and sin(t*1.5) with real signals) and translating them to Python: t=linspace(0,14*pi,750); sig=hilbert(sin(t)); sig2=hilbert(sin(t*1.5)); sig(1:29)=[]; sig2(1:29)=[]; t(1:29)=[]; sig(end-29:end)=[]; sig2(end-29:end)=[]; t(end-29:end)=[]; sync=cos(angle(exp(1i*(angle(sig)-angle(sig2)))));

Also the "angle(exp(1i*(...)))" wrapper is unnecessary, as the cosine of the angular difference would produce the same result, so it can be simplified a bit to "sync=cos(angle(sig)-angle(sig2))", I believe.
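For reference, a rough Python translation of the Matlab snippet above (keeping the edge trimming of the original; the two sinusoids would be replaced with real ROI signals):

```python
import numpy as np
from scipy.signal import hilbert

# Two sinusoids at different frequencies, as in the Matlab example
t = np.linspace(0, 14 * np.pi, 750)
sig = hilbert(np.sin(t))
sig2 = hilbert(np.sin(t * 1.5))

# Drop the first 29 and last 30 samples (Hilbert edge effects),
# matching the Matlab deletions
keep = slice(29, -30)
sig, sig2, t = sig[keep], sig2[keep], t[keep]

# Synchrony timecourse: cosine of the angular difference, using the
# simplified form noted above
sync = np.cos(np.angle(sig) - np.angle(sig2))
```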

Intersubject Phase Synchrony

  • wavelet (or Hilbert) transform over frequencies
  • within each frequency, extract the phase
  • within each frequency, calculate the mean vector length as a metric of synchrony

$\left| \frac{1}{n} \sum_{j=1}^{n} e^{i \theta_j} \right|$

from scipy.signal import hilbert, butter, filtfilt, lfilter, firwin
from scipy.stats import rayleigh

def isps(data, sampling_freq=.5, low_cut=.04, high_cut=.07, order=5):
    '''Compute Dynamic Intersubject Phase Synchrony
    
    This function computes the instantaneous intersubject phase synchrony for a single voxel/roi
    timeseries. Requires multiple subjects. This method is largely based on that described by Glerean
    et al., 2012 and performs a Hilbert transform on narrow bandpass filtered (Butterworth) timeseries
    data to get the instantaneous phase angle. The function returns a dictionary containing the
    average phase angle, the average vector length, and the Rayleigh statistic using circular
    statistics (Fisher, 1993).
    
    This function requires narrow band filtering your data. As a default we use the recommendations
    by (Glerean et al., 2012) of .04-.07Hz. This is similar to the "slow-4" band (0.025–0.067 Hz)
    described by (Zuo et al., 2010; Penttonen & Buzsáki, 2003), but excludes the .03 band, which has been
    demonstrated to contain aliased respiration signals (Birn, 2006).
    
    Though this function returns the mean phase angle and p-values, we primarily recommend focusing
    on ISPS (i.e., vector_length).

    Birn RM, Smith MA, Bandettini PA, Diamond JB. 2006. Separating respiratory-variation-related
    fluctuations from neuronal-activity- related fluctuations in fMRI. Neuroimage 31:1536–1548.
    
    Buzsáki, G., & Draguhn, A. (2004). Neuronal oscillations in cortical networks. Science,
    304(5679), 1926-1929.
    
    Fisher, N. I. (1993). Statistical analysis of circular data. Cambridge University Press.

    Glerean, E., Salmi, J., Lahnakoski, J. M., Jääskeläinen, I. P., & Sams, M. (2012). 
    Functional magnetic resonance imaging phase synchronization as a measure of dynamic 
    functional connectivity. Brain connectivity, 2(2), 91-101.
    
    Args:
        data: (pd.DataFrame, np.ndarray) observations x subjects data
        sampling_freq: (float) sampling frequency of data in Hz
        low_cut: (float) lower cutoff frequency for the bandpass filter in Hz
        high_cut: (float) upper cutoff frequency for the bandpass filter in Hz
        order: (int) filter order for butterworth bandpass
        
    Returns:
        dictionary with mean phase angle, vector length, and rayleigh statistic
        
    '''
    
    if not isinstance(data, (pd.DataFrame, np.ndarray)):
        raise ValueError('data must be a pandas dataframe or numpy array (observations by subjects)')
        
    # hilbert operates over the last axis by default; specify axis=0 to transform over time
    phase = np.angle(hilbert(_butter_bandpass_filter(pd.DataFrame(data), low_cut, high_cut, sampling_freq, order=order), axis=0))
    
    out = {'average_angle':_phase_mean_angle(phase)}
    out['vector_length'] = _phase_vector_length(phase)
    out['p'] = _phase_rayleigh_p(phase)
    return out

def _hamming_bandpass_filter(data, low_cut, high_cut, fs, taps=5, window='hamming', mode='same', **kwargs):
    '''Apply a windowed-sinc FIR bandpass filter to each column of a DataFrame'''
    nyq = 0.5 * fs
    fir = firwin(taps, [low_cut, high_cut], nyq=nyq, pass_zero=False,
                 window=window, scale=False)
    return np.array([np.convolve(data[x], fir, mode=mode) for x in data]).T

def _butter_bandpass_filter(data, low_cut, high_cut, fs, axis = 0, order=5, **kwargs):
    '''Apply a bandpass butterworth filter'''
    nyq = 0.5 * fs
    b, a = butter(order, [low_cut/nyq, high_cut/nyq], btype='band')
    return filtfilt(b, a, data, axis=axis)

def _phase_mean_angle(phase_angles):
    '''Compute mean phase angle using circular statistics
    
        Can take 1D (observation for a single feature) or 2D (observation x feature) signals
        
        Implementation from:
        
            Fisher, N. I. (1993). Statistical analysis of circular data. Cambridge University Press.
            
        Args:
            phase_angles: (np.array) 1D or 2D array of phase angles

        Returns:
            mean phase angle: (np.array)

    '''
    
    axis = 0 if len(phase_angles.shape) == 1 else 1
    # Circular mean: arctan2 of the mean sine over the mean cosine
    return np.arctan2(np.mean(np.sin(phase_angles), axis=axis), np.mean(np.cos(phase_angles), axis=axis))

def _phase_vector_length(phase_angles):
    '''Compute vector length of phase angles using circular statistics
    
        Can take 1D (observation for a single feature) or 2D (observation x feature) signals
        
        Implementation from:
        
            Fisher, N. I. (1993). Statistical analysis of circular data. Cambridge University Press.
            
        Args:
            phase_angles: (np.array) 1D or 2D array of phase angles

        Returns:
             phase angle vector length: (np.array)

    '''
    
    axis = 0 if len(phase_angles.shape) == 1 else 1
    return np.float32(np.sqrt(np.mean(np.cos(phase_angles), axis=axis)**2 + np.mean(np.sin(phase_angles), axis=axis)**2))

def _phase_rayleigh_p(phase_angles):
    '''Compute the p-value of the phase_angles using the Rayleigh statistic

        Implementation from:
        
            Fisher, N. I. (1993). Statistical analysis of circular data. Cambridge University Press.
            
        Args:
            phase_angles: (np.array) 1D or 2D array of phase angles

        Returns:
             p-values: (np.array)

    '''
    
    # n is the number of angles averaged (subjects, for observations x subjects data)
    n = phase_angles.shape[1] if phase_angles.ndim == 2 else phase_angles.shape[0]
    Z = n*_phase_vector_length(phase_angles)**2
    if n <= 50:
        return np.exp(-1*Z)*(1 + (2*Z - Z**2)/(4*n) - (24*Z - 132*Z**2 +76*Z**3 - 9*Z**4)/(288*n**2))
    else:
        return np.exp(-1*Z)

lowcut = .04
highcut = .07

# lowcut = .01
# highcut = .03

tr = 1.5
roi = 35

sub_rois = {}
for sub in sub_timeseries:
    sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
sub_rois = pd.DataFrame(sub_rois)

stats = isps(sub_rois, low_cut=lowcut, high_cut=highcut, sampling_freq=1/tr)

f,a = plt.subplots(nrows=3, figsize=(15,4), sharex=True)
a[0].plot(stats['average_angle'])
a[1].plot(stats['vector_length'])
a[2].plot(stats['p'])
plt.tight_layout()
tr = 1.5
roi = 6

mask_x[roi].plot()

sub_rois = {}
for sub in sub_timeseries:
    sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
sub_rois = pd.DataFrame(sub_rois)

time_freq = {}
for i,cutoff in enumerate([(0.01, 0.027),(0.027, 0.073),(0.073, 0.198),(0.198, 0.25)]):
    print(cutoff)
    time_freq[i] = isps(sub_rois, low_cut=cutoff[0], high_cut=cutoff[1], sampling_freq=1/tr, order=5)['vector_length']
    
time_freq = pd.DataFrame(time_freq).T
f,a = plt.subplots(figsize=(30, time_freq.shape[0]))
sns.heatmap(time_freq, cmap='hot', ax=a, vmin=0, vmax=1)
tr = 1.5
roi = 32

mask_x[roi].plot()

sub_rois = {}
for sub in sub_timeseries:
    sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
sub_rois = pd.DataFrame(sub_rois)

time_freq = {}
for i,cutoff in enumerate([(.005, .015),(0.015, 0.04),(0.04, 0.07),(0.07, 0.15), (.15, .25)]):
    print(cutoff)
    time_freq[i] = isps(sub_rois, low_cut=cutoff[0], high_cut=cutoff[1], sampling_freq=1/tr, order=5)['vector_length']
    
time_freq = pd.DataFrame(time_freq).T
f,a = plt.subplots(figsize=(30, time_freq.shape[0]))
sns.heatmap(time_freq, cmap='hot', ax=a, vmin=0, vmax=1)
f,a = plt.subplots(nrows=5, figsize=(15,10))
for i,cutoff in enumerate([(.005, .015),(0.015, 0.04),(0.04, 0.07),(0.07, 0.15), (.15, .25)]):
    synchrony = {}
    for roi in range(50):
        sub_rois = {}
        for sub in sub_timeseries:
            sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
        sub_rois = pd.DataFrame(sub_rois)
        synchrony[roi] = isps(sub_rois, low_cut=cutoff[0], high_cut=cutoff[1], sampling_freq=1/tr, order=5)['vector_length']
    phase_synchrony_brain = roi_to_brain(pd.DataFrame(synchrony).mean(), mask_x)
    phase_synchrony_brain.plot(cmap='RdBu_r', vmax=1, axes=a[i], threshold=.25, title=f"Frequency cutoff: {cutoff[0]} - {cutoff[1]}")
cutoff = (0.005, 0.01)
synchrony = {}
for roi in range(50):
    sub_rois = {}
    for sub in sub_timeseries:
        sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
    sub_rois = pd.DataFrame(sub_rois)
    synchrony[roi] = isps(sub_rois, low_cut=cutoff[0], high_cut=cutoff[1], sampling_freq=1/tr, order=5)['vector_length']
synchrony = pd.DataFrame(synchrony)

phase_synchrony_brain = roi_to_brain(synchrony.mean(), mask_x)
phase_synchrony_brain.plot(cmap='RdBu_r', vmax=1)
# lowcut = .01
# highcut = .03

tr = 1.5
roi = 6

sub_rois = {}
for sub in sub_timeseries:
    sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
sub_rois = pd.DataFrame(sub_rois)


time_freq = {}
lowcut = None
for freq in np.arange(0.001, (1/tr)/2, .025):
    # Use the first frequency as the initial lower cutoff (avoids fragile float equality)
    if lowcut is None:
        lowcut = freq
    else:
        highcut = freq
        print(lowcut, highcut)
        time_freq[highcut] = isps(sub_rois, low_cut=lowcut, high_cut=highcut, sampling_freq=1/tr, order=5)['vector_length']
        lowcut = highcut
time_freq = pd.DataFrame(time_freq).T
f,a = plt.subplots(figsize=(20, time_freq.shape[0]))
sns.heatmap(time_freq)
synchrony = {}
for roi in range(50):
    sub_rois = {}
    for sub in sub_timeseries:
        sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
    sub_rois = pd.DataFrame(sub_rois)

    stats = isps(sub_rois, low_cut=0.04, high_cut=0.07, sampling_freq=1/tr)
    synchrony[roi] = stats['vector_length']
synchrony = pd.DataFrame(synchrony)

sync = Adjacency(1 - pairwise_distances(synchrony.T, metric='correlation'), matrix_type='similarity')
sync.plot(vmin=-1, vmax=1, cmap='RdBu_r')
t = sync.threshold(upper = .3, binarize=True)
t.plot()
plot_network(sync.threshold(upper = .3, binarize=True))
degree = pd.Series(dict(sync.threshold(upper=.2, binarize=True).to_graph().degree()))
brain_degree = roi_to_brain(degree, mask_x)
brain_degree.plot()
synchrony = {}
for roi in range(50):
    sub_rois = {}
    for sub in sub_timeseries:
        sub_rois[sub] = sub_timeseries[sub].iloc[:, roi]
    sub_rois = pd.DataFrame(sub_rois)

    stats = isps(sub_rois, low_cut=0.01, high_cut=0.027, sampling_freq=1/tr)
    synchrony[roi] = stats['vector_length']
synchrony = pd.DataFrame(synchrony)

sync2 = Adjacency(1 - pairwise_distances(synchrony.T, metric='correlation'), matrix_type='similarity')
sync2.plot(vmin=-1, vmax=1, cmap='RdBu_r')
(sync - sync2).plot(vmin=-.5, vmax=.5, cmap='RdBu_r')

brain_synchrony_mean = roi_to_brain(synchrony.mean(), mask_x)
brain_synchrony_mean.write('/Users/lukechang/Downloads/phase_synchrony.nii.gz')

f,a = plt.subplots(figsize=(15,2))
synchrony.plot(ax=a)

Recommended reading

  • Glerean, E., Salmi, J., Lahnakoski, J. M., Jääskeläinen, I. P., and Sams, M. (2012). Functional Magnetic Resonance Imaging Phase Synchronization as a Measure of Dynamic Functional Connectivity. Brain Connect. 2, 91–101. doi:10.1089/brain.2011.0068.

  • Nastase, S. A., Gazzola, V., Hasson, U., & Keysers, C. (2019). Measuring shared responses across subjects using intersubject correlation. Social Cognitive and Affective Neuroscience, Volume 14, Issue 6, June 2019, Pages 667–685, https://doi.org/10.1093/scan/nsz037

  • Nummenmaa, L., Lahnakoski, J. M., and Glerean, E. (2018). Sharing the social world via intersubject neural synchronisation. Curr. Opin. Psychol. 24. doi:10.1016/j.copsyc.2018.02.021.

  • Simony, E., Honey, C. J., Chen, J., Lositsky, O., Yeshurun, Y., Wiesel, A., et al. (2016). Dynamic reconfiguration of the default mode network during narrative comprehension. Nat. Commun. 7. doi:10.1038/ncomms12141.

Contributions

Juha Lahnakoski wrote text, Luke Chang wrote text and developed the practical tutorials.

Cut Text

Intersubject correlations give a summary statistic of synchrony over long periods of time, usually over an entire imaging session. However, the level of synchrony may change considerably from one moment to the next with the experimental condition. Time varying measures of synchrony will be discussed below, after we consider the topic of statistical inference on static intersubject correlation values.

From statistics section.

To illustrate these approaches, Video/Figure 1 below shows the process of creating these null distributions and compares them to a similar distribution built from real resting state data of the same duration in the same participants, recorded just prior to the movie data in the same imaging study. Resting state is an ideal condition for demonstrating the true null distribution of intersubject correlation as it involves no external synchronizing factors apart from the repeating noise of the scanner gradients, which is generally not of interest to us. Thus, any correlations in the resting state data arise by chance. As can be seen, the null distributions based on the surrogate data follow the distribution of resting state ISCs well as the number of iterations increases.
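A minimal sketch of the circular time-shifting approach for a single subject pair (synthetic noise stands in for the two timecourses here; phase scrambling follows the same pattern with randomized Fourier phases instead of shifts):

```python
import numpy as np

rng = np.random.default_rng(2)
n_tp = 300

# Two unrelated timeseries standing in for two subjects' voxel data
x = rng.standard_normal(n_tp)
y = rng.standard_normal(n_tp)

# Null distribution by circularly shifting one timeseries: each shift
# preserves the autocorrelation structure but breaks the temporal alignment
null_r = np.array([
    np.corrcoef(x, np.roll(y, shift))[0, 1]
    for shift in rng.integers(1, n_tp, size=1000)
])

# The 95th percentile of the null gives a one-tailed significance threshold
threshold = np.percentile(null_r, 95)
```

The observed ISC between two subjects would then be compared against `threshold`; with real fMRI data the surrogate preserves each subject's autocorrelation, so the null is wider than for white noise.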

Figure 1: Distributions of pairwise ISC values in real movie data (blue), compared with null distributions based on temporally shifted (red) and phase scrambled (yellow) data, as well as real resting state data (purple). Top rows show the timecourse of a randomly selected voxel of one participant that has been shifted or scrambled in the case of the surrogate timecourses. Second row shows the timecourse of a second participant extracted from the same voxel. Bottom plot shows how the distributions evolve as the number of iterations increases. Dashed vertical lines correspond to the 95th percentile of each distribution. Note the distinctly extended right tail in the movie data, whereas the other distributions appear highly consistent.

The number of participants, N, strongly affects the null distribution of the mean pairwise ISC. As the values are always below one, the standard deviation of the null distribution is reduced with the square of N (see Figure 2 below). This also holds for the resting state data, so it does not affect the validity of the distributions. However, as a result, very small correlations become significant as N increases, which may not be desirable, as the interpretation of very weak correlations can be troublesome.
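The shrinkage of the null distribution with N can be sketched with a quick simulation (white noise stands in for null fMRI data; the exact scaling will differ for autocorrelated signals):

```python
import numpy as np

rng = np.random.default_rng(3)
n_tp = 200

def null_mean_isc_sd(n_subs, n_iter=200):
    '''SD of the mean pairwise correlation across random (null) datasets'''
    means = []
    for _ in range(n_iter):
        data = rng.standard_normal((n_tp, n_subs))
        r = np.corrcoef(data.T)  # subjects x subjects correlation matrix
        # Average the upper-triangle (pairwise) correlations
        means.append(r[np.triu_indices(n_subs, k=1)].mean())
    return np.std(means)

for n in [5, 10, 20]:
    print(n, null_mean_isc_sd(n))
```

The null SD drops steeply as the number of subjects (and hence pairs) grows, so with large N even tiny mean ISCs clear the significance threshold.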

When the mean activity of the remaining participants is correlated with the activity timecourse of a left-out participant, the effects of the sample size and the variance of the ISC values are greatly reduced, and group-level statistics can be derived more similarly to traditional second-level GLM analyses. For example, parametric t-statistics are largely consistent with those derived based on surrogate or resting state data.

Overall, the statistics for ISFC can be performed similarly to local ISC. However, it is important to note that the distribution of mean correlations between two different regions differs from within-region correlations. In particular, as the number of participant pairs increases, the distribution of ISC values is increasingly skewed to the right as negative correlations are eliminated nearly completely due to dependencies between the pairwise correlations. This is intuitive as any larger group of individuals who correlate negatively with other participants would often be positively correlated within the group, thus cancelling out any large negative values in the mean ISC. By contrast, when correlations are calculated between two different regions, negative values can occur if the stimulus-evoked activity timecourses in the two regions are consistently anticorrelated.

The distributions of ISCs and ISFCs in real and null data are depicted below as a function of the number of participants on a logarithmic color scale. Overall, the null distribution of the ISFC values appears slightly wider than the ISC null distribution. However, this is mainly driven by the larger number of values in the correlation matrices for the ISFC (off-diagonal entries) than the ISC (diagonal entries), as the shape of the distribution appears similar but the tail is truncated (see bottom right panel).

[Image: null_distribution]

Figure 2: Distributions of real ISCs and ISFCs (top) and their null distributions (bottom) with different numbers of participants. Left-most panels show the ISCs and middle panels the ISFCs. Right-most panels show example distributions from the two left panels corresponding to the case with N=11 participants. The distributions are plotted on a logarithmic scale to make the differences in the left tails visible. Note that negative numbers exist in the real ISFC distribution while they are largely eliminated in the ISC.


Future work will evaluate the usefulness of these measures for elucidating individual variability and, for example, the effects of psychiatric disorders on intersubject connectivity. However, our initial results presented at this conference suggest that stimulus-driven connectivity may reveal exciting insight into how functional network structure relevant for understanding naturalistic events is affected by social interaction disorders (see poster 206 and the accompanying talk in the Neurodevelopmental Disorders and Environmental Impact oral session).

  • LC: this felt a little out of place, any chance you have a link to the video? We could embed it in the ISFC section if you want.

To get an intuition of the temporal constituents of Pearson’s correlation coefficient between two signals, it can be written as the expected value of the elementwise product of the signals (standardized to zero mean and unit variance). This case can be seen in the top panel of video 2, where two independent sinusoidal signals are shown in blue and red, and their elementwise product is shown as a gray dotted line. The correlation coefficient of the two signals is the expected value of the gray signal, that is, the sum of the timepoints divided by N-1, where N is the number of timepoints. This timecourse gives direct insight into which timepoints are contributing most to the correlation. However, as can be seen from this simple example, the correlation coefficient tends to be dominated by high-amplitude timepoints; when one (or both) of the signals is close to zero (the mean value), the product is also close to zero. By contrast, when both are far away from zero, the timepoints contribute more to the overall correlation, either negatively when the signals have opposite signs or positively when the signs are the same. The vertical histogram on the left shows a dynamically updating distribution of the elementwise products and the correlation coefficient as the number of timepoints increases. As can be seen, after a few periods of the signal, the correlation coefficient stabilizes near zero. Similarly, ISC can be derived by producing a similar timecourse of elementwise products for all pairs of subjects and averaging over the participant pairs (Lahnakoski et al., 2017). This timecourse directly produces the ISC values, but the dependence on the extreme amplitude values remains and the timecourse only shows the relative importance of the individual timepoints rather than a readily interpretable statistical measure of synchrony.
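This decomposition is easy to verify numerically: after standardizing the two signals, summing their elementwise product and dividing by N-1 reproduces Pearson's r exactly (a sketch with synthetic data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.standard_normal(n)
y = 0.5 * x + rng.standard_normal(n)

# Standardize to zero mean and unit variance (ddof=1 to match Pearson's r)
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# The elementwise product shows each timepoint's contribution to r:
# large when both signals are far from their means, near zero otherwise
contributions = zx * zy
r = contributions.sum() / (n - 1)

# Matches the standard Pearson correlation
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```

Plotting `contributions` over time gives exactly the gray timecourse described above, making visible which timepoints drive the overall correlation.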

  • LC: this seems interesting and relevant. but I can't understand what you are saying enough to write the latex notation and create a corresponding figure.
  • Chen, J., Leong, Y. C., Honey, C. J., Yong, C. H., Norman, K. A., and Hasson, U. (2017). Shared memories reveal shared structure in neural activity across individuals. Nat. Neurosci. 20, 115–125. doi:10.1038/nn.4450.
  • Gratton, C., Laumann, T. O., Nielsen, A. N., Greene, D. J., Gordon, E. M., Gilmore, A. W., et al. (2018). Functional Brain Networks Are Dominated by Stable Group and Individual Factors, Not Cognitive or Daily Variation. Neuron 98, 439-452.e5. doi:10.1016/j.neuron.2018.03.035.

  • Hasson, U., Avidan, G., Gelbard, H., Vallines, I., Harel, M., Minshew, N., et al. (2009). Shared and idiosyncratic cortical activation patterns in autism revealed under continuous real-life viewing conditions. Autism Res. 2, 220–231. doi:10.1002/aur.89.

  • Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., and Malach, R. (2004). Intersubject synchronization of cortical activity during natural vision. Science 303, 1634–1640. doi:10.1126/science.1089506.

  • Hove, M. J., and Risen, J. L. (2009). It’s All in the Timing: Interpersonal Synchrony Increases Affiliation. Soc. Cogn. 27, 949–960. doi:10.1521/soco.2009.27.6.949.

  • Konvalinka, I., Xygalatas, D., Bulbulia, J., Schjødt, U., Jegindø, E. M., Wallot, S., et al. (2011). Synchronized arousal between performers and related spectators in a fire-walking ritual. Proc. Natl. Acad. Sci. U. S. A. 108, 8514–8519. doi:10.1073/pnas.1016955108.

  • Lahnakoski, J. M., Glerean, E., Jääskeläinen, I. P., Hyönä, J., Hari, R., Sams, M., et al. (2014). Synchronous brain activity across individuals underlies shared psychological perspectives. Neuroimage 100C, 316–324. doi:10.1016/j.neuroimage.2014.06.022.

  • Lahnakoski, J. M., Jääskeläinen, I. P., Sams, M., and Nummenmaa, L. (2017). Neural mechanisms for integrating consecutive and interleaved natural events. Hum. Brain Mapp. doi:10.1002/hbm.23591.

  • Mäntylä, T., Nummenmaa, L., Rikandi, E., Lindgren, M., Kieseppä, T., Hari, R., et al. (2018). Aberrant Cortical Integration in First-Episode Psychosis During Natural Audiovisual Processing. Biol. Psychiatry 84, 655–664. doi:10.1016/j.biopsych.2018.04.014.

  • Miles, L. K., Nind, L. K., and Macrae, C. N. (2009). The rhythm of rapport: Interpersonal synchrony and social perception. J. Exp. Soc. Psychol. 45, 585–589. doi:10.1016/j.jesp.2009.02.002.

  • Nummenmaa, L., Lahnakoski, J. M., and Glerean, E. (2018). Sharing the social world via intersubject neural synchronisation. Curr. Opin. Psychol. 24. doi:10.1016/j.copsyc.2018.02.021.

  • Parkinson, C., Kleinbaum, A. M., and Wheatley, T. (2018). Similar neural responses predict friendship. Nat. Commun. 9, 1–14. doi:10.1038/s41467-017-02722-7.

  • Salmi, J., Roine, U., Glerean, E., Lahnakoski, J., Nieminen-Von Wendt, T., Tani, P., et al. (2013). The brains of high functioning autistic individuals do not synchronize with those of others. NeuroImage Clin. 3. doi:10.1016/j.nicl.2013.10.011.

  • Simony, E., Honey, C. J., Chen, J., Lositsky, O., Yeshurun, Y., Wiesel, A., et al. (2016). Dynamic reconfiguration of the default mode network during narrative comprehension. Nat. Commun. 7. doi:10.1038/ncomms12141.